Statistical Mechanics of Histories: A Cluster Monte Carlo Algorithm 
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We present an efficient computational approach to sample the histories of nonlinear stochastic 
processes. This framework builds upon recent work on casting a d-dimensional stochastic dynamical 
system into a d + f-dimensional equilibrium system using the path integral approach. We introduce 
a cluster algorithm that efficiently samples histories and discuss how to include measurements that 
are available into the estimate of the histories. This allows our approach to be applicable to the 
simulation of rare events and to optimal state and parameter estimation. We demonstrate the utility 
of this approach for 4> 4 Langevin dynamics in two spatial dimensions where our algorithm improves 
sampling efficiency up to an order of magnitude. 
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I. INTRODUCTION 



Onsager and Machlup 1] pioneered the path ensemble 
approach to classical stochastic processes in 1953, only 
a few years after Feynman's seminal work on quantum 
systems 0- Despite nearly simultaneous origins how- 
ever, the computational application of this framework to 
classical systems lagged far behind its quantum counter- 
part. While Monte Carlo methods were applied to lattice 
gauge theory and quantum condensed matter systems in 
the early 1970's 3], only within the past decade has the 
Onsager-Machlup approach become practical for compu- 
tational modelling of classical nonequilibrium processes. 

Following the analytical results of Domany for 2+1 
dimensional Potts models, computational work began 
with Zimmer who devised a Monte Carlo algorithm 
to sample entire space-time configurations, or histories, 
of a kinetic Ising model. This work demonstrated the 
utility of the Monte Carlo approach where histories can 
be conditioned on rare events. Olender and Elber used 
a similar approach to circumvent the time limitations of 
molecular dynamics simulations, specifically to find reac- 
tion pathways when both the initial and final states are 
known. See the work of Chandler et al. k|, Jonsson et 
al. and others using this methodology 0, El ■ 

In this paper we extend the computational work by 
presenting a percolation-based cluster Monte Carlo ap- 
proach to sample the statistical mechanics of histories for 
nonlinear stochastic processes. We also describe how to 
apply this method to rare event simulations and optimal 
estimation. The cluster algorithm we present improves 
the statistical sampling of histories in Monte Carlo sim- 
ulations significantly. In traditional spatial cluster algo- 
rithms [1 1| . the clusters represent statistically indepen- 
dent objects at a given time. In the d + 1 dimensional 
mapping we introduce, the clusters can be interpreted as 
statistically independent objects in space-time. 

Our goal is to determine the conditional statistics of 
histories for a stochastic dynamical system x(i), given 
a model and incomplete information about that system. 
The state vector x satisfies the following Ito process: 

<fx(t) = f(x,t)dt + (2D(x,<)) 1/2 dW(i), (1) 



where f(x, t) is the force term, and the stochastic ef- 
fects are provided by the last term, in which the diffu- 
sion matrix D acts on a vector- valued Wiener process, 
W. We assume that the noise errors are uncorrelated, 
and that the initial value x(io) is a random variable with 
a known distribution. This system could represent the 
configuration of a protein evolving under Brownian dy- 
namics 0, the concentration of interacting metabolites, 
the locations of atoms in a crystal undergoing a structural 
phase transition or nucleation, or the state of a queue in 
a stochastic fluid model. The final state can also be a 
rare event on which the history is conditioned. For in- 
stance, the configuration of an unfolded protein chain can 
be conditioned in the initial state and the folded protein 
in the final state. 

The probability of the dynamics generating a given 
history is simply related to the probability that it expe- 
riences a certain noise history, r](tk) = W(ifc+i) — W(tfe), 
at times tk, where k — 0,1, ... ,T. We incorporate this 
probability into the discretized form of Eq. (JJJ . In the in- 
terest of simplicity, we use the explicit Euler-Maruyama 
discretization scheme. This leads to the following: 



x fe+ i = x fe 



f (x fc , t k )At + (2D(x fc , tk)) 1 ' 2 ^). (2) 



For Gaussian uncorrelated white noise with variance 
(rjrj 1 ) = 6(t — t'), the probability distribution of noise is 
P{r](t)} oc exp(-i£fc \r](t k )\ 2 /At). The probability of a 
specific history is given by, P{i](t)} oc exp(— S) where S 
is the action of the (i-dimensional system (equivalent to 
the Hamiltonian of the d+ 1-dimensional system). By re- 
arranging terms in Eq. [21 the form of the action becomes 

S = ^ T^{l" Xfc + 1 ~ Xfc ~ f(xfc,tfc)At] T D(xfc,^ fc )~ 1 



fc=0 



[xfc+i - x fc - f(x k ,t k )At] | 



(3) 



where T indicates the transpose. With action S, the 
statistics of histories of the time dependent, stochastic 
dynamical system has been cast as an equilibrium statis- 
tical mechanical system. 

Now let us incorporate the information about the sys- 
tem into the action functional. For simplicity, we will 
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assume that the information comes at discrete times t m 
where m labels each observation m — 1, . . . , M. These 
observations (e.g. experimental measurements) are given 
in a function, h(x, t), and it is assumed to have errors 
denoted here by e(x, t), i.e., 

y(^m: ^m) h(x m , £ m ) ~\~ f-rrn 

with error covariance, (ee') = R m . By using Bayes' 
rule |10| . the action arising from measurements becomes 
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(4) 



The action-functional, Stotai = S + Smi assigns weights 
to individual histories. In the absence of additional infor- 
mation, histories unlikely to arise from the dynamics are 
given a lower weight than histories which are more likely. 
However, when there are measurements, histories which 
are far from the measurements are given lower weight 
than those closer to the measurements. 



II. A SPACE-TIME CLUSTER ALGORITHM 

To sample the distribution of histories and hence to as- 
sign weights to them, various methods have been applied 
(including local Monte Carlo, unigrid and generalized hy- 
brid Monte Carlo [Toj ) • Here we describe a space-time 
cluster algorithm which is an extension of the embedded 
dynamics algorithm introduced by Brower and Tamayo 
(BT) [l4(- Cluster algorithms are widely used in physics, 
statistics and computer science [l3|. The first of these 
was introduced by Swendsen and Wang (SW) (llj which 
is based on a mapping between the Potts model and a 
percolation problem [T^] ■ 

Brower and Tamayo extended the SW algorithm to a 
continuous field theory by embedding discrete variables 
(spins) into the continuous field in an equilibrium classi- 
cal 4> 4 model ^3 . The <fi 4 potential is a symmetric double 
well potential of the form: 



y(r,t) = (a/4)^ 4 (r,t)-(6/2)0 2 (r,t) 



(5) 



The discrete spin variables, s r , label the two wells in <fi 4 
potential such that cf> r — s r \(j> r \. At fixed values of \4>(r)\ 
a ferromagnetic Ising model is embedded into the <fi field 
theory which allows the use of the SW dynamics. The 
detailed procedure of the embedded dynamics is as fol- 
lows: 

• Update <fi r via a standard local Monte Carlo algorithm. 

• Form percolation clusters dictated by the bond proba- 
bility, 



P, 



, =1- e -/3 r ,'(l + sX.) = I _ e -(l*r||#.| + *r#.) 



where the effective spin-spin coupling is /3 rr < = |0 r $.|. 
Note that p rr i reduces to 1 — exp(— 2/3 rr /) when the spins 



are the same sign. 

• Update the Ising variables by flipping the percolation 
clusters independently with probability 1/2. If the move 
is accepted, flip the sign of the fields in the cluster. 

To extend the embedded dynamics to space-time, we 
need to redefine the clusters based on the discretized 
dynamical equation and the corresponding action as in 
Eqs.[21and[3] Next we illustrate this formalism with the 
<fi 4 field theory in (2 + 1) dimensions. 

We consider the discretized Langevin equation, 

0(r , t + At) = 0(r , t) + [ 0(r< , t) - 40(r , t) 

i 

+At[-a<fi(r,t) 3 + b(fi(r,t)] +VAtr ? (r,f) ) (6) 

where the force term is the derivative of Eq. with 
respect to 0, and is sum over the nearest neigh- 
bors of </>(r, i). The noise variables T)(r,t) are chosen to 
be Gaussian distributed, independent random variables 
of mean zero and with correlations (rj(r, t)rj(r' , £')) = 
2D8 r ^'S(t — t'). For this model the action becomes 



s = 5 ^53(^ r ' t+A *)-^( r '*)- At 

r,t 

+ b(fi{r,t)] -At[^(r;,i)-4<£(r,f)]) 



a(fi 3 (r,t) 



(7) 



By expanding the square in the right side of Eq. we 
obtain many cross terms representing different couplings 
between neighbors both in space and time. All of the in- 
teractions between a site and its neighbors in space and 
time are shown explicitly by Zimmer [5|]. Excluding the 
local terms (e.g. <fi(r, t) 2 ), the interactions yielding dif- 
ferent spin-spin couplings can be grouped into four types 
(using (rj,tk) as the reference site): 

1. Nearest neighbors of (rj,i/-_i) coupled to (rj,t k ): 

ft =2At(^0(ri,4-i)) cfi(rj,t k ). 

i 

2. Site (rj-,£fe_i) coupled to (r^i/-): 



(26-8)At-2aAt 2 ( ri , f fe _i)+2 (j>{r j ,t k )(j>{r j ,t k - 1 ) 



3. Nearest neighbors of (r,, t k ) coupled to each other: 

ft = -At 2 ( ]T 0(r, , t fc )) ( **)) • 

i i 

4. Nearest neighbors of (r^, t k ) coupled to (rj,t k ): 
& = (E <l>(Ti, t fc )) ([(8 - 26)Ai 2 - 2At]<fi(r j ,t k ) 

i 

+2aAi 2 3 (r J ,i, £ )). 

The probability of a site having a bond with any of its 
neighbors is 



Pi 



1 



-2ft/(4-D At) 



(8) 
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where i — 1, ... ,4. A significant difference from BT is 
that the sign of /3j is not known a priori. Depending on 
the value of <j), the interaction can be either ferromagnetic 
or antiferromagnetic ^5|. At each step we determine 
whether the coupling term is ferromagnetic (/3j > 0) or 
antiferromagnetic < 0) and require the signs of spins 
to be the same or opposite respectively for a bond to 
exist. Once the clusters are defined, we use the same 
steps as BT described earlier in text. 

Next, we compare the performance of this cluster 
method to two other algorithms, local Monte Carlo and 
unigrid [To| . To quantify performance, we measured the 
correlation time of a quantity M = \^2<fi\, the sum of 
fields at all space and time points. This quantity is anal- 
ogous to the magnetization of a spin system. Because M 
is a global quantity, it is one of the slowest modes of the 
system We remind the reader that our cluster algo- 
rithm updates the fields by changing the sign of fields in 
a flipped cluster. Therefore, by taking the absolute value 
of the fields we are left with the true correlations. The 
correlation time, r, is obtained by fitting exp(— tjr) to 
the autocorrelation function defined as (M to+t M to ). 
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FIG. 1: The cluster size distribution at noise D = 25. The 
size distribution scales as n s ~ s -2 ' 2 . 

show the performance of the local, cluster and unigrid al- 
gorithms at the same noise strength (D = 25) for differ- 
ent system sizes. The cluster algorithm correlation times 
are much smaller than the local r's and comparable to 
the unigrid algorithm. 



TABLE I: Correlation times of the magnetization M for local 
and cluster algorithms for several noise strengths, D. The 
system dimensions are L = 10 and T = 100, the acceptance 
ratio, a ~ 0.5, At — 0.05 and Ax = 1.0. The length of the run 
was 100,000 MCS, and the data analyzed for the last 80,000 
MCS. The cluster algorithm is fastest at D « 25. 



D Tlocal ^cluster 

1 947 775~ 

5 180 134 

15 25 8.8 

20 19 2.9 

25 12 1.4 

30 9 1.1 



TABLE II: Correlation time, r, of magnetization, M , with 
the local, cluster and unigrid algorithms for different system 
sizes, L and T with D = 25. The cluster algorithm is a factor 
of nine faster than the local one, and comparable to unigrid. 
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32 


12.2 


1.74 


1.54 


16 


128 


11.1 


1.50 


1.80 


32 


512 


13.3 


1.79 


1.62 



III. MEASUREMENTS 



The performance of the cluster algorithm depends on 
several factors. For a fair comparison of our algorithm to 
the local one, we used an acceptance ratio of a « 0.5 
for which the local algorithm is empirically most effi- 
cient. The correlation times are highly dependent on 
noise strength (proportional to the square of the temper- 
ature) as is expected from any algorithm. We measure 
r's at different noise strengths for a system of spatial di- 
mension, L — 10 with periodic boundary conditions and 
time dimension, T = 100 with open boundary conditions. 
In Table |U these times are shown for the local and clus- 
ter algorithms as characterized by the decay of Cm(*)- 
The cluster algorithm performs only slightly better than 
the local algorithm at low noise strengths, and it is most 
beneficial at D » 25 with nine times more efficiency. At 
this noise strength, the cluster size distribution scales as 



-2.2 



as shown in Fig. 



We also compared the performance of the cluster algo- 
rithm to a unigrid algorithm 10] which has been shown 
to speed up the dynamics significantly. In Table [HI we 



Thus far we have not included any measurements or 
local fields in the system. In forecasting complex sys- 
tems (e.g. weather) it is crucial to make use of data 
available to predict the path of the system. The cluster 
algorithm we have introduced is especially useful where 
some measurements are available. As illustrated in Eq.|3 
the action corresponding to the measurements, Sm, can 
be added to the action, S, in Eq. 



S 



M 



E 



2^ 



(9) 



where (f> m is the value of 4> measured at r m , t m with error 
variance er^. The cluster algorithm can be easily modi- 
fied to incorporate the measurements. The spin-spin cou- 
plings defined earlier remain the same because the mea- 
surements are added to the action separately and are in- 
dependent of the dynamics. However the cluster flipping 
probability must be adjusted since it costs more/less to 
flip the sign of a spin if there is a measurement at that 
point. The local field at a site is analogous to having a 
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measurement in our case. Dotsenko et al. |l7j have dis- 
cussed the probability of flipping a site in an Ising model 
when there are local fields at that site. In the presence of 
external field h, the probability of flipping a cluster gets 
weighted by the local fields, i.e., 

pmp = exp(±^2hj)/[exp(^2hj) + exp(- ^hj)], 

3 3 3 

(10) 

which reduces to pm P = 1/2 as expected for h = 0. 

Let us now derive the probability of flipping a cluster 
in the presence of measurements. Expanding the square 
in the right hand side of the action in Eq. [5] yields only 
one coupled term, — 2</>(r m , £ m )0 m (r m , i m ). With this 
coupled term, the flipping probability becomes 

PRip = e£ m 2|0(r,t)|0 m (r,t) + e £ m -2\4>{r ,t)\<j> m (v ,t) ' ( U ) 

We set artificial measurement points such that the system 
is initially in the positive well (at t — 0), and it transi- 
tions into the negative well forced by the measurements. 
We measured the probability distribution function (pdf) 
of 4> using the cluster algorithm as shown in Fig. The 
pdf obtained using the local algorithm agrees with this 
pdf as expected. In Table ITTT1 we show the performance 
of both algorithms for different system sizes (D = 25) 
with four measurements points of variance a 2 = 0.01. 
The cluster algorithm consistently outperforms the local 
algorithm in the presence of the measurements. 



0.005 




FIG. 2: Probability distribution function pf <f> obtained by 
the cluster algorithm. The system is L — 10, T = 20, and 
D = 1 with measurement points, 4> m , placed at every space 
point at every three time slices, <j> m (t < T/2) = 1 and 4> m {t > 
T/2) = — 1 with a standard deviation of a = 0.02. The system 
is driven to the negative well forced by the measurements. 



IV. DISCUSSION 

In this paper, we have described a cluster Monte Carlo 
algorithm to sample space-time histories of a nonlinear 



TABLE III: Correlation times of M for local and cluster al- 
gorithms with measurements at different system sizes at noise 
strength, D — 25. The cluster algorithm consistently outper- 
forms the local one. 
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16 128 


11.9 
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24 288 


12.3 


1.7 



stochastic process. This approach can be applied to 
study pathways to rare events as well as for optimal state 
and parameter estimation. 

At the noise strength where the cluster size distribu- 
tion scales, the cluster algorithm outperforms the local 
Monte Carlo updates significantly. We have not observed 
scaling of magnetization correlation times as a function 
of system size, therefore the observed speedup is indepen- 
dent of the system size. The noise strength required to 
observe this scaling depends on the size of the space-time 
domain. For the finite (and relatively small) systems we 
have studied in this paper, this noise does not correspond 
to the critical temperature in the original D dimensional 
system. 

Although the efficiency of our algorithm is comparable 
to the unigrid algorithm, it can be preferred over the un- 
igrid method when the observation of the clusters as cor- 
related structures is of interest. The clusters are statis- 
tically independent space-time events, and the temporal 
(time-axis) extent of these objects provides an estimate 
of their lifetime. For instance in nucleation process, the 
correlated structures in the system, e.g. droplets, signify 
the fluctuations of the metastable equilibrium and 
it is of interest to measure the lifetime of these droplets 
directly. In the future we plan to use this method to sim- 
ulate the Ginzburg-Landau equation (model A) in order 
to study nucleation and find the distribution of the life- 
times (t) of clusters to test theoretical predictions . 

Our method is applicable to more general potentials 
arising from other nonlinear stochastic partial differential 
equations such as Cahn-Hilliard-Cook equation which en- 
ables the study of spinodal decomposition. 



Acknowledgments 

We thank G. L. Eyink, W. Klein, J. Machta, and S. 
K. Mitter for useful discussions. This work (LA-UR 05- 
8402) was funded partly by the Department of Energy 
under contracts W-7405-ENG-36 and the DOE Office 
of Science's ASCR Applied Mathematics Research pro- 
gram, and partly by Center for Nonlinear Studies. 



5 



[1] L. Onsager and S. Machlup, Phys. Rev. 91, 1505 (1953). 

L. Onsager, Phys. Rev. 38, 2265 (1931). 
[2] R. P. Feynman and A. R. Hibbs, Quantum Physics and 

Path Integrals, New York: McGraw-Hill, (1965). 
[3] J. B. Kogut, Rev. Mod. Phys. 51, 659 (1979). M. Suzuki, 

Prog. Theor. Phys. 56, 1454 (1976). H. F. Trotter, Proc. 

Am. Math. Soc. 10, 545 (1959). 
[4] E. Domany, Phys. Rev. Lett. 52, 871 (1984). 
[5] M. F. Zimmer, Phys. Rev. Lett. 75, 1431 (1995). 
[6] C. Dellago, P. Bolhuis, F. Csajka, and D. Chandler, J. 

Chem. Phys . 108, 1964 (1998). 
[7] R. Olender and R. Elber, J. Chem. Phys. 105, 9299 

(1996). 

[8] H. Jonsson, G. Mills and K. W. Jacobsen, Classical and 
Quantum Dynamics in Condensed Phase Simulations, 
(World Scientific, Singapore, 1998). 

[9] W. E, W. Ren, and E. Vanden-Eijnden, Phys. Rev. B 66, 
52301 (2002). D. M. Zuckerman and T. B. Woolf, Phys. 
Rev. E 63, 016702 (2000). D. Passerone and M. Par- 
rinello, Phys. Rev. Lett. 87, 108302 (2001). R. J. Allen, 



D. Frenkel, P. R. ten Wolde, cond-mat/0509499 (2005). 
[10] F. J. Alexander, G. L. Eyink and J. M. Restrepo, J. Stat. 

Phys. 119, 1331 (2005). 
[11] R. H. Swendsen and J.-S. Wang, Phys. Rev. Lett 58, 86 

(1987). 

[12] C. M. Fortuin and P. W. Kasteleyn, Physica 57, 536 
(1972). 

[13] A. Barbu and S. C. Zhu, IEEE Trans. PAMI (2005). 
[14] R. C. Brower and P. Tamayo, Phys. Rev. Lett. 62, 1087 
(1989). 

[15] J.-S. Wang, R. H. Swendsen and R. Kotecky, Phys. Rev. 

Lett. 63, 109 (1989). 
[16] R. Toral, in Third Granada Lectures in Computational 

Physics: Proceedings of the III Granada Seminar on 

Computational Physics, (Springer- Verlag, Berlin, 1995). 
[17] VI. S. Dotsenko, W. Selke and A. L. Talapov, Physica A 

170, 278 (1991). 
[18] C. Unger and W. Klein, Phys. Rev. B 29, 2698 (1984). 



